Unstructured data (or unstructured information) is information that either lacks a pre-defined data model or is not organized in a pre-defined manner. It is typically text-heavy, but may also contain data such as dates, numbers, and facts. The resulting irregularities and ambiguities make it harder for traditional programs to interpret than data stored in fielded form in databases or annotated (semantically tagged) in documents.
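The contrast between fielded and unstructured data can be illustrated with a minimal Python sketch. The record, the note text, and the helper names `find_date_like` and `find_numbers` are all hypothetical, chosen only for illustration. The sketch shows that a fielded value is retrieved by name, while the same kinds of facts in free text must be guessed at with patterns, which leaves their meaning ambiguous.

```python
import re

# Fielded form: every value has a name, so lookup is unambiguous.
record = {"customer": "Ada Smith", "order_date": "2021-03-04", "amount": 120.50}

# Unstructured form: similar facts embedded in free text (hypothetical note).
note = "Ada Smith called on 03/04/2021 about the 120.50 charge from 2 weeks ago."


def find_date_like(text):
    """Return substrings shaped like NN/NN/NNNN; the pattern cannot tell
    whether the first number is a day or a month."""
    return re.findall(r"\b\d{2}/\d{2}/\d{4}\b", text)


def find_numbers(text):
    """Return every run of digits (with optional decimal part); nothing in
    the text says which number is an amount, a date part, or a duration."""
    return re.findall(r"\d+(?:\.\d+)?", text)


print(record["order_date"])   # direct lookup by field name
print(find_date_like(note))   # a candidate date, day/month order unknown
print(find_numbers(note))     # a mix of date parts, an amount, and a duration
```

In the fielded record the program knows which value is the order date. In the note it can only find date-shaped and number-shaped strings, and deciding what each one means requires further interpretation.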
Document schemas are the highest level of the metadata structure associated with a document file. They allow a user to control and manipulate the documents or files that are added to a database. A document schema groups or otherwise associates similar files, even when those files are stored in disparate places across several databases. It also governs how files are added and what information is collected about them through their metadata.
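As a rough illustration of the idea, the following Python sketch models a document schema as an object that lists the metadata fields it requires and keeps track of member files regardless of which database holds them. The class `DocumentSchema`, its methods, and the database and file names are assumptions made for this example and do not correspond to any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentSchema:
    """Hypothetical schema: a name, required metadata fields, member files."""
    name: str
    required_fields: tuple
    members: list = field(default_factory=list)

    def add_file(self, database, path, metadata):
        # The schema decides what information must be collected on entry.
        missing = [f for f in self.required_fields if f not in metadata]
        if missing:
            raise ValueError(f"missing metadata: {missing}")
        self.members.append(
            {"database": database, "path": path, "metadata": dict(metadata)}
        )

    def databases(self):
        # Files under one schema may live in several databases.
        return sorted({m["database"] for m in self.members})


invoices = DocumentSchema("invoice", ("vendor", "issued"))
invoices.add_file("finance_db", "/2021/inv-001.pdf",
                  {"vendor": "Acme", "issued": "2021-03-04"})
invoices.add_file("archive_db", "/scans/a17.pdf",
                  {"vendor": "Globex", "issued": "2020-11-30"})

print(invoices.databases())
```

Here the two invoice files sit in different databases but are associated through the same schema, and a file that lacks a required metadata field is rejected when it is added.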